Exploiting Non-Sequence Data in Dynamic Model Learning

Authors

  • Tzu-Kuo Huang
  • Ziv Bar-Joseph
  • Geoffrey J. Gordon
  • Ali Shojaie
Abstract

Virtually all methods of learning dynamic models from data start from the same basic assumption: that the learning algorithm will be provided with one or more sequences of data generated from the dynamic model. However, in quite a few modern time series modeling tasks, the collection of reliable time series data turns out to be a major challenge, due to either slow progression of the dynamic process of interest or inaccessibility of repeated measurements of the same dynamic process over time. In most of those situations, we observe that it is easier to collect a large amount of non-sequence samples, or random snapshots of the dynamic process of interest without time information. This thesis aims to exploit such non-sequence data in learning a few widely used dynamic models, including fully observable linear and nonlinear models as well as Hidden Markov Models (HMMs). For fully observable models, we point out several issues of model identifiability when learning from non-sequence data, and develop EM-type learning algorithms based on maximizing approximate likelihood. We also consider the setting where a small amount of sequence data is available in addition to non-sequence data, and propose a novel penalized least squares approach that uses non-sequence data to regularize the model. For HMMs, we draw inspiration from recent advances in spectral learning of latent variable models and propose spectral algorithms that provably recover the model parameters, under reasonable assumptions on the generative process of non-sequence data and the true model. To the best of our knowledge, this is the first formal guarantee on learning dynamic models from non-sequence data. We also consider the case where little sequence data is available, and propose learning algorithms that, as in the fully observable case, use non-sequence data to provide regularization, but do so in combination with spectral methods.
Experiments on synthetic data and several real data sets, including gene expression and cell image time series, demonstrate the effectiveness of our proposed methods. In the last part of the thesis we return to the usual setting of learning from sequence data, and consider learning bi-clustered vector auto-regressive models, whose transition matrix is both sparse, revealing significant interactions among variables, and bi-clustered, identifying groups of variables that have similar interactions with other variables. Such structures may aid other learning tasks in the same domain that have abundant non-sequence data by providing better regularization in our proposed non-sequence methods.
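
The distinction the abstract draws between sequence data and non-sequence data can be made concrete with a small simulation. The following is a minimal sketch only, assuming a first-order linear model with Gaussian noise; the transition matrix `A`, the noise level, and the function names `simulate_sequence` and `draw_non_sequence_samples` are illustrative choices, not taken from the thesis.

```python
import numpy as np


def simulate_sequence(A, x0, n_steps, noise_std, rng):
    """Simulate x_{t+1} = A x_t + Gaussian noise; returns shape (n_steps, d)."""
    d = A.shape[0]
    xs = np.empty((n_steps, d))
    x = np.asarray(x0, dtype=float)
    for t in range(n_steps):
        xs[t] = x
        x = A @ x + noise_std * rng.standard_normal(d)
    return xs


def draw_non_sequence_samples(A, n_samples, max_time, noise_std, rng):
    """Each sample is one snapshot from an independent run of the process.

    The time at which each snapshot is taken is drawn at random and then
    discarded, so the learner sees only an unordered set of states.
    """
    d = A.shape[0]
    samples = np.empty((n_samples, d))
    for i in range(n_samples):
        x0 = rng.standard_normal(d)      # assumed initial-state distribution
        t = rng.integers(0, max_time)    # hidden time index in [0, max_time)
        samples[i] = simulate_sequence(A, x0, t + 1, noise_std, rng)[-1]
    return samples


rng = np.random.default_rng(0)
A = np.array([[0.9, -0.2],
              [0.2, 0.9]])               # hypothetical stable transition matrix

# Sequence data: one trajectory whose rows are ordered in time.
sequence = simulate_sequence(A, [1.0, 0.0], 20, 0.05, rng)

# Non-sequence data: many snapshots with no time information attached.
snapshots = draw_non_sequence_samples(A, 200, 20, 0.05, rng)
```

Here `sequence` preserves temporal order, so consecutive rows can be regressed on each other directly, whereas `snapshots` carries no ordering at all, which is the setting the thesis targets.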

Similar resources

Seismic Data Forecasting: A Sequence Prediction or a Sequence Recognition Task

In this paper, we have tried to predict earthquake events in a cluster of seismic data on the Pacific Ring of Fire, using multivariate adaptive regression splines (MARS). The model is employed as either a predictor for a sequence prediction task, or a binary classifier for a sequence recognition problem, which could alternatively help to predict an event. Here, we explain that sequence prediction/r...

Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition

Learning dynamic models from observed data has been a central issue in many scientific studies or engineering tasks. The usual setting is that data are collected sequentially from trajectories of some dynamical system operation. In quite a few modern scientific modeling tasks, however, it turns out that reliable sequential data are rather difficult to gather, whereas out-of-order snapshots are ...

On the Role of Dynamic Assessment on Promotion of Writing Linguistic Accuracy among EFL Learners: An Interventionist Model

This study is conducted under the domain of Vygotskian Socio-cultural Theory (SCT) of mind and the notion of dynamic assessment to elevate the linguistic accuracy of EFL learners’ writing skill. Forty homogeneous intermediate EFL learners from four intact classes were divided into two groups: dynamic assessment (DA) and non-dynamic assessment (NDA). As a pre-test, the participants were given writing...

Non radial model of dynamic DEA with the parallel network structure

In this article, a non-radial method of dynamic DEA with a parallel network structure is presented and used to calculate relative efficiency measures when inputs and outputs do not change equally. In this model, the divisions of the DMU under evaluation are arranged in parallel, but its dynamic structure is assumed to be in series. Since in real applications there are undesirable inputs an...

Model Metric Co-Learning for Time Series Classification

We present a novel model-metric co-learning (MMCL) methodology for sequence classification which learns in the model space – each data item (sequence) is represented by a predictive model from a carefully designed model class. MMCL learning encourages sequences from the same class to be represented by ‘close’ model representations, well separated from those for different classes. Existing appro...

Publication date: 2012